Translated by Google Translate
In training neural networks, batch normalization has many benefits, not all of them entirely understood. But it also has some drawbacks. Foremost is arguably memory consumption, as computing the batch statistics requires all instances within the batch to be processed simultaneously, whereas without batch normalization it would be possible to process them one by one while accumulating the weight gradients. Another drawback is that the distribution parameters (mean and standard deviation) are unlike all other model parameters in that they are not trained using gradient descent but require special treatment, complicating implementation. In this paper, I show a simple and straightforward way to address these issues. The idea, in short, is to add terms to the loss that, for each activation, cause the minimization of the negative log likelihood of a Gaussian distribution that is used to normalize the activation. Among other benefits, this will hopefully contribute to the democratization of AI research by lowering the hardware requirements for training larger models.
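The loss term described above can be illustrated for a single scalar activation. This is only a minimal sketch under stated assumptions: the names `mu`, `log_sigma`, `fit_streaming` and the learning rate are hypothetical, the activations are simulated draws rather than network outputs, and the abstract does not say how the term is weighted against the task loss or how gradients reach the activation itself, so those parts are omitted.

```python
import math
import random

def gaussian_nll(x, mu, log_sigma):
    """Negative log likelihood of x under N(mu, exp(log_sigma)^2)."""
    sigma = math.exp(log_sigma)
    return 0.5 * math.log(2.0 * math.pi) + log_sigma + 0.5 * ((x - mu) / sigma) ** 2

def nll_grads(x, mu, log_sigma):
    """Analytic gradients of gaussian_nll with respect to mu and log_sigma."""
    sigma = math.exp(log_sigma)
    z = (x - mu) / sigma
    return -z / sigma, 1.0 - z * z

def normalize(x, mu, log_sigma):
    """Normalize the activation with the same Gaussian parameters."""
    return (x - mu) / math.exp(log_sigma)

def fit_streaming(activations, lr=0.002):
    """Plain gradient descent on the NLL, one instance at a time (no batch statistics)."""
    mu, log_sigma = 0.0, 0.0  # assumed initial values
    for x in activations:
        g_mu, g_ls = nll_grads(x, mu, log_sigma)
        mu -= lr * g_mu
        log_sigma -= lr * g_ls
    return mu, log_sigma

# Simulated activations with mean 3 and standard deviation 2 (illustrative data only).
rng = random.Random(0)
stream = [rng.gauss(3.0, 2.0) for _ in range(50000)]
mu, log_sigma = fit_streaming(stream)
```

Because `mu` and `log_sigma` are updated by ordinary gradient steps, instances can be processed one by one, and the parameters drift toward the mean and log standard deviation of the activations they see.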
Recent diffusion-based AI art platforms are able to create impressive images from simple text descriptions. This makes them powerful tools for concept design in any discipline that requires creativity in visual design tasks. This is also true for the early stages of architectural design, which involve multiple rounds of ideation, sketching and modelling. In this paper, we investigate how applicable diffusion-based models already are to these tasks. We examine the applicability of the platforms Midjourney, DALL-E 2 and StableDiffusion to a series of common use cases in architectural design to determine which are already solvable or might soon be. We also analyze how the platforms are already being used, applying NLP methods to a data set of 40 million Midjourney queries to extract common usage patterns. With these insights, we derive a workflow for interior and exterior design that combines the strengths of the individual platforms.
Rowhammer is a serious security problem in contemporary dynamic random-access memory (DRAM) where reads or writes of bits can flip other bits. DRAM manufacturers add mitigations, but don't disclose details, making it difficult for customers to evaluate their efficacy. We present a tool, based on active learning, that automatically infers parameters of Rowhammer mitigations against synthetic models of modern DRAM.
Generalized Labeled Multi-Bernoulli (GLMB) densities arise in a host of multi-object system applications, analogous to Gaussians in single-object filtering. However, computing the GLMB filtering density requires solving NP-hard problems. To alleviate this computational bottleneck, we develop a linear complexity Gibbs sampling framework for GLMB density computation. Specifically, we propose a tempered Gibbs sampler that exploits the structure of the GLMB filtering density to achieve an $\mathcal{O}(T(P+M))$ complexity, where $T$ is the number of iterations of the algorithm, and $P$ and $M$ are the numbers of hypothesized objects and measurements, respectively. This innovation enables an $\mathcal{O}(T(P+M+\log(T))+PM)$ complexity implementation of the GLMB filter. Convergence of the proposed Gibbs sampler is established and numerical studies are presented to validate the proposed GLMB filter implementation.
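The Barlow Twins self-supervised learning objective requires neither negative samples nor asymmetric learning updates, and achieves results on a par with the current state of the art in computer vision. We therefore present Audio Barlow Twins, a novel self-supervised audio representation learning approach that adapts Barlow Twins to the audio domain. We pre-train on the large-scale audio dataset AudioSet and evaluate the quality of the learned representations on 18 tasks from the HEAR 2021 Challenge, achieving results that outperform, or are otherwise on a par with, the current state of the art for instance-discrimination self-supervised approaches to audio representation learning. Code at https://github.com/jonahanton/ssl_audio.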
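For orientation, the Barlow Twins objective can be sketched as follows. This is the commonly stated form of the loss (cross-correlation of two standardized views, pushed toward the identity matrix), written here in plain Python. It is not taken from the linked repository, and the function names and the default weight `lam=5e-3` are assumptions made for illustration.

```python
import math

def standardize(batch, eps=1e-12):
    """Scale each embedding dimension to zero mean and unit std over the batch."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in batch) / n) + eps
        for j in range(d)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in batch]

def cross_correlation(z_a, z_b):
    """d x d cross-correlation matrix between the embeddings of two views."""
    za, zb = standardize(z_a), standardize(z_b)
    n, d = len(za), len(za[0])
    return [
        [sum(za[b][i] * zb[b][j] for b in range(n)) / n for j in range(d)]
        for i in range(d)
    ]

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Invariance term on the diagonal plus weighted redundancy term off the diagonal."""
    c = cross_correlation(z_a, z_b)
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + lam * off_diag

# Two identical views whose embedding dimensions are uncorrelated give a loss near zero.
views = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
loss_identical = barlow_twins_loss(views, views)
```

The loss uses only the two views of each sample and treats both branches identically, which is why no negative samples or asymmetric updates are needed.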
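In this work, we explore the use of objects in Simultaneous Localization and Mapping (SLAM) in unseen worlds and propose an object-aided system (OA-SLAM). More precisely, we show that, compared to low-level points, the main benefit of objects lies in their higher-level semantics and discriminating power. Points, by contrast, have better spatial localization accuracy than the generic coarse models used to represent objects (cuboids or ellipsoids). We show that combining points and objects is of great interest for addressing the problem of camera pose recovery. Our main contributions are: (1) we improve the relocalization ability of a SLAM system using high-level object landmarks; (2) we build an automatic system capable of identifying, tracking and reconstructing objects with 3D ellipsoids; (3) we show that object-based localization can be used to reinitialize or resume camera tracking. Our fully automatic system allows object mapping and enhanced pose tracking recovery, which we think can significantly benefit the AR community. Our experiments show that the camera can be relocalized from viewpoints where classical methods fail. We demonstrate that this localization allows a SLAM system to continue working despite a tracking loss, which can happen frequently with an inexperienced user. Our code and test data are released at gitlab.inria.fr/tangram/oa-slam.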
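Multiple sclerosis (MS) is a chronic progressive neurological disease characterized by the development of lesions in the white matter of the brain. T2-fluid-attenuated inversion recovery (FLAIR) brain magnetic resonance imaging (MRI) provides superior visualization and characterization of MS lesions relative to other MRI modalities. Longitudinal brain FLAIR MRI in MS, which involves repeatedly imaging a patient over time, provides helpful information for clinicians monitoring disease progression. Predicting future whole-brain MRI examinations has only been attempted in limited applications, such as healthy aging and structural degeneration in Alzheimer's disease. In this article, we present novel modifications to deep learning architectures for MS FLAIR image synthesis that support the prediction of longitudinal images in a flexible, continuous way. This is achieved with learned transposed convolutions, which model time as a spatially distributed array with variable temporal properties at different spatial locations. This approach can therefore, in theory, model spatially specific, time-dependent brain development, supporting the modelling of more rapid growth at appropriate physical locations, such as the site of an MS brain lesion. The approach also lets the clinician user define how far into the future a predicted examination should target. Accurate prediction of future imaging can inform clinicians of potential patient outcomes, which may contribute to earlier treatment and better prognoses. Four distinct deep learning architectures have been developed. The ISBI2015 longitudinal MS dataset was used to validate and compare our proposed approaches. Results demonstrate that a modified ACGAN achieves the best performance and reduces variability in model accuracy.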
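Multiple sclerosis (MS) is a chronic neurological condition characterized by the development of lesions in the white matter of the brain. T2-fluid-attenuated inversion recovery (FLAIR) brain magnetic resonance imaging (MRI) provides superior visualization and characterization of MS lesions relative to other MRI modalities. Follow-up brain FLAIR MRI in MS provides helpful information for clinicians monitoring disease progression. In this study, we propose a novel modification to generative adversarial networks (GANs) to predict future lesion-specific FLAIR MRI for MS at fixed time intervals. We use supervised guided attention and dilated convolutions in the discriminator, which supports an informed prediction of whether a generated image is real based on attention to the lesion area. This in turn has the potential to help the generator predict the lesion area of future examinations more accurately. We compare our method to several baselines and one state-of-the-art CF-SAGAN model [1]. In conclusion, our results indicate that the proposed method achieves higher accuracy and reduces the standard deviation of the prediction errors in the lesion area compared with other models with similar overall performance.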